- PERLREF(1) UNIX System V (Release 0.0 Patchlevel 00) PERLREF(1)
-
-
-
- NAME
- perlref - Perl references and nested data structures
-
- DESCRIPTION
- In Perl 4 it was difficult to represent complex data
- structures, because all references had to be symbolic, and
- even that was difficult to do when you wanted to refer to a
- variable rather than a symbol table entry. Perl 5 not only
- makes it easier to use symbolic references to variables, but
- lets you have "hard" references to any piece of data. Any
- scalar may hold a hard reference. Since arrays and hashes
- contain scalars, you can now easily build arrays of arrays,
- arrays of hashes, hashes of arrays, arrays of hashes of
- functions, and so on.
-
- Hard references are smart--they keep track of reference
- counts for you, automatically freeing the thing referred to
- when its reference count goes to zero. If that thing
- happens to be an object, the object is destructed. See the
- perlobj manpage for more about objects. (In a sense,
- everything in Perl is an object, but we usually reserve the
- word for references to objects that have been officially
- "blessed" into a class package.)
-
- A symbolic reference contains the name of a variable, just
- as a symbolic link in the filesystem merely contains the
- name of a file. The *glob notation is a kind of symbolic
- reference. Hard references are more like hard links in the
- file system: merely another way of getting at the same
- underlying object, irrespective of its name.
-
- "Hard" references are easy to use in Perl. There is just
- one overriding principle: Perl does no implicit referencing
- or dereferencing. When a scalar is holding a reference, it
- always behaves as a scalar. It doesn't magically start
- being an array or a hash unless you tell it so explicitly by
- dereferencing it.
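A minimal sketch of that principle follows. The array @colors and the variable names are made up for illustration. The scalar holding the reference stays a scalar until you dereference it explicitly.

```perl
use strict;
use warnings;

my @colors = ('red', 'green', 'blue');   # hypothetical data
my $ref    = \@colors;                   # $ref is a scalar holding a reference

my $kind  = ref($ref);        # what kind of thing is referred to
my $count = scalar(@$ref);    # explicit dereference: number of elements
my $first = $$ref[0];         # explicit dereference: first element

print "kind=$kind count=$count first=$first\n";
```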
-
- References can be constructed several ways.
-
- 1. By using the backslash operator on a variable,
- subroutine, or value. (This works much like the &
- (address-of) operator works in C.) Note that this
- typically creates ANOTHER reference to a variable, since
- there's already a reference to the variable in the
- symbol table. But the symbol table reference might go
- away, and you'll still have the reference that the
- backslash returned. Here are some examples:
-
- Page 1 (printed 6/30/95)
-
- $scalarref = \$foo;
- $arrayref = \@ARGV;
- $hashref = \%ENV;
- $coderef = \&handler;
-
-
- 2. A reference to an anonymous array can be constructed
- using square brackets:
-
- $arrayref = [1, 2, ['a', 'b', 'c']];
-
- Here we've constructed a reference to an anonymous array
- of three elements whose final element is itself a
- reference to another anonymous array of three elements.
- (The multidimensional syntax described later can be used
- to access this. For example, after the above,
- $arrayref->[2][1] would have the value "b".)
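As a quick check of the structure just described, here is a sketch that builds the same anonymous array and looks inside it. The helper variable names are illustrative only.

```perl
use strict;
use warnings;

my $arrayref = [1, 2, ['a', 'b', 'c']];

my $outer_size = scalar(@$arrayref);   # elements in the outer array
my $inner      = $arrayref->[2];       # the nested array reference
my $inner_size = scalar(@$inner);      # elements in the inner array
my $value      = $arrayref->[2][1];    # second element of the inner array

print "outer=$outer_size inner=$inner_size value=$value\n";
```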
-
- 3. A reference to an anonymous hash can be constructed
- using curly brackets:
-
- $hashref = {
- 'Adam' => 'Eve',
- 'Clyde' => 'Bonnie',
- };
-
- Anonymous hash and array constructors can be intermixed
- freely to produce as complicated a structure as you
- want. The multidimensional syntax described below works
- for these too. The values above are literals, but
- variables and expressions would work just as well,
- because assignment operators in Perl (even within
- local() or my()) are executable statements, not
- compile-time declarations.
-
- Because curly brackets (braces) are used for several
- other things including BLOCKs, you may occasionally have
- to disambiguate braces at the beginning of a statement
- by putting a + or a return in front so that Perl
- realizes the opening brace isn't starting a BLOCK. The
- economy and mnemonic value of using curlies is deemed
- worth this occasional extra hassle.
-
- For example, if you wanted a function to make a new hash
- and return a reference to it, you have these options:
-
- sub hashem { { @_ } } # silently wrong
- sub hashem { +{ @_ } } # ok
- sub hashem { return { @_ } } # ok
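Here is a small sketch of the two working forms, plus a structure that mixes the hash and array constructors with expressions rather than literals. The names hashem_plus, hashem_return, $limit, and $record are invented for this example.

```perl
use strict;
use warnings;

sub hashem_plus   { +{ @_ } }          # unary plus marks an anonymous hash
sub hashem_return { return { @_ } }    # so does an explicit return

my $h1 = hashem_plus(Adam => 'Eve');
my $h2 = hashem_return(Clyde => 'Bonnie');

# Constructors nest freely, and values may be computed at run time.
my $limit  = 3;
my $record = {
    name     => 'Clyde',
    partners => ['Bonnie'],
    squares  => [ map { $_ * $_ } 1 .. $limit ],
};

print "$h1->{Adam} $h2->{Clyde} $record->{squares}[2]\n";
```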
-
- 4. A reference to an anonymous subroutine can be
- constructed by using sub without a subname:
-
- $coderef = sub { print "Boink!\n" };
-
- Note the presence of the semicolon. Except for the fact
- that the code inside isn't executed immediately, a sub
- {} is not so much a declaration as it is an operator,
- like do{} or eval{}. (However, no matter how many times
- you execute that line (unless you're in an eval("...")),
- $coderef will still have a reference to the SAME
- anonymous subroutine.)
-
- For those who worry about these things, the current
- implementation uses shallow binding of local()
- variables; my() variables are not accessible. This
- precludes true closures. However, you can work around
- this with a run-time (rather than a compile-time)
- eval():
-
- {
- my $x = time;
- $coderef = eval "sub { \$x }";
- }
-
- Normally--if you'd used just sub{} or even eval{}--your
- new sub would only have been able to access the global
- $x. But because you've used a run-time eval(), this
- will not only generate a brand new subroutine reference
- each time it is called, it will also grant access to the my()
- variable lexically above it rather than the global one.
- The particular $x accessed will be different for each
- new sub you create. This mechanism yields deep binding
- of variables. (If you don't know what closures, deep
- binding, or shallow binding are, don't worry too much
- about it.)
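Here is a sketch of the run-time eval() workaround in a loop, assuming a Perl in which the string eval can see the enclosing my() variable as described above. The loop, @subs, and $n are illustrative only. Each pass creates a new sub bound to its own $x.

```perl
use strict;
use warnings;

my @subs;
for my $n (1, 2, 3) {
    my $x = $n * 10;                    # a fresh lexical on every pass
    my $coderef = eval 'sub { $x }';    # compiled at run time, sees this $x
    die $@ if $@;
    push @subs, $coderef;
}

# Each sub remembers the $x that was in scope when it was created.
my @values = map { $_->() } @subs;
print "@values\n";
```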
-
- 5. References are often returned by special subroutines
- called constructors. Perl objects are just references to a
- special kind of object that happens to know which
- package it's associated with. Constructors are just
- special subroutines that know how to create that
- association. They do so by starting with an ordinary
- reference, and it remains an ordinary reference even
- while it's also being an object. Constructors are
- customarily named new(), but don't have to be:
-
- $objref = new Doggie (Tail => 'short', Ears => 'long');
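For the curious, here is roughly what such a constructor might look like. This is only a sketch: the body of Doggie::new and the tail() accessor are invented here, and the call is written with the arrow form Doggie->new(...) rather than the form shown above.

```perl
use strict;
use warnings;

{
    package Doggie;

    # Start with an ordinary hash reference, then associate it with the class.
    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }

    # Hypothetical accessor method.
    sub tail {
        my ($self) = @_;
        return $self->{Tail};
    }
}

my $objref = Doggie->new(Tail => 'short', Ears => 'long');

my $class = ref($objref);      # the package the reference was blessed into
my $tail  = $objref->tail;     # method call through the object
my $ears  = $objref->{Ears};   # still usable as an ordinary hash reference

print "$class: tail=$tail ears=$ears\n";
```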
-
-
- 6. References of the appropriate type can spring into
- existence if you dereference them in a context that
- assumes they exist. Since we haven't talked about
- dereferencing yet, we can't show you any examples yet.
-
- That's it for creating references. By now you're probably
- dying to know how to use references to get back to your
- long-lost data. There are several basic methods.
-
- 1. Anywhere you'd put an identifier as part of a variable
- or subroutine name, you can replace the identifier with
- a simple scalar variable containing a reference of the
- correct type:
-
- $bar = $$scalarref;
- push(@$arrayref, $filename);
- $$arrayref[0] = "January";
- $$hashref{"KEY"} = "VALUE";
- &$coderef(1,2,3);
-
- It's important to understand that we are specifically
- NOT dereferencing $arrayref[0] or $hashref{"KEY"} there.
- The dereference of the scalar variable happens BEFORE it
- does any key lookups. Anything more complicated than a
- simple scalar variable must use methods 2 or 3 below.
- However, a "simple scalar" includes an identifier that
- itself uses method 1 recursively. Therefore, the
- following prints "howdy".
-
- $refrefref = \\\"howdy";
- print $$$$refrefref;
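To see method 1 at work from start to finish, here is a self-contained sketch. The underlying variables ($foo, @months, %settings) and the body of handler() are placeholders made up for the example.

```perl
use strict;
use warnings;

my $foo = "hello";
my (@months, %settings);
sub handler { return join('+', @_) }   # hypothetical subroutine

my $scalarref = \$foo;
my $arrayref  = \@months;
my $hashref   = \%settings;
my $coderef   = \&handler;

my $bar = $$scalarref;              # reads $foo
push(@$arrayref, "placeholder");    # pushes onto @months
$$arrayref[0]    = "January";       # sets $months[0]
$$hashref{"KEY"} = "VALUE";         # sets $settings{"KEY"}
my $result = &$coderef(1, 2, 3);    # calls handler(1,2,3)

my $refrefref = \\\"howdy";
my $greeting  = $$$$refrefref;      # three dereferences, then the scalar

print "$bar $months[0] $settings{KEY} $result $greeting\n";
```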
-
-
- 2. Anywhere you'd put an identifier as part of a variable
- or subroutine name, you can replace the identifier with
- a BLOCK returning a reference of the correct type. In
- other words, the previous examples could be written like
- this:
-
- $bar = ${$scalarref};
- push(@{$arrayref}, $filename);
- ${$arrayref}[0] = "January";
- ${$hashref}{"KEY"} = "VALUE";
- &{$coderef}(1,2,3);
-
- Admittedly, it's a little silly to use the curlies in
- this case, but the BLOCK can contain any arbitrary
- expression, in particular, subscripted expressions:
-
- &{ $dispatch{$index} }(1,2,3); # call correct routine
-
- Because of being able to omit the curlies for the simple
- case of $$x, people often make the mistake of viewing
- the dereferencing symbols as proper operators, and
- wonder about their precedence. If they were, though,
- you could use parens instead of braces. That's not the
- case. Consider the difference below; case 0 is a
- short-hand version of case 1, NOT case 2:
-
- $$hashref{"KEY"} = "VALUE"; # CASE 0
- ${$hashref}{"KEY"} = "VALUE"; # CASE 1
- ${$hashref{"KEY"}} = "VALUE"; # CASE 2
- ${$hashref->{"KEY"}} = "VALUE"; # CASE 3
-
- Case 2 is also deceptive in that you're accessing a
- variable called %hashref, not dereferencing through
- $hashref to the hash it's presumably referencing. That
- would be case 3.
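A sketch of method 2 in practice: a BLOCK holding a subscripted expression, followed by cases 1 and 3 side by side. The %dispatch table, its keys, and $target are invented for illustration.

```perl
use strict;
use warnings;

# A dispatch table: hash values are code references.
my %dispatch = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);
my $index   = 'mul';
my $product = &{ $dispatch{$index} }(6, 7);   # call the selected routine

# Case 1 stores into the hash that $hashref points to.
# Case 3 stores through the scalar reference kept under "KEY".
my $target  = "old";
my $hashref = { KEY => \$target };
${$hashref}{"OTHER"}     = "x";       # case 1: sets a hash element
${ $hashref->{"KEY"} }   = "VALUE";   # case 3: sets $target

print "$product $hashref->{OTHER} $target\n";
```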
-
- 3. The case of individual array elements arises often
- enough that it gets cumbersome to use method 2. As a
- form of syntactic sugar, the two corresponding lines above
- can be written:
-
- $arrayref->[0] = "January";
- $hashref->{"KEY"} = "VALUE";
-
- The left side of the arrow can be any expression
- returning a reference, including a previous dereference.
- Note that $array[$x] is NOT the same thing as
- $array->[$x] here:
-
- $array[$x]->{"foo"}->[0] = "January";
-
- This is one of the cases we mentioned earlier in which
- references could spring into existence when in an lvalue
- context. Before this statement, $array[$x] may have
- been undefined. If so, it's automatically defined with
- a hash reference so that we can look up {"foo"} in it.
- Likewise $array[$x]->{"foo"} will automatically get
- defined with an array reference so that we can look up
- [0] in it.
-
- One more thing here. The arrow is optional BETWEEN
- bracket subscripts, so you can shrink the above down to
-
- $array[$x]{"foo"}[0] = "January";
-
- Which, in the degenerate case of using only ordinary
- arrays, gives you multidimensional arrays just like C's:
-
- $score[$x][$y][$z] += 42;
-
- Well, okay, not entirely like C's arrays, actually. C
- doesn't know how to grow its arrays on demand. Perl
- does.
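Here is a short sketch of references springing into existence in lvalue context, and of an array growing on demand. The index values are arbitrary choices for the example.

```perl
use strict;
use warnings;

my @array;
my $x = 2;

# Nothing exists yet; the intermediate references are created for us.
$array[$x]{"foo"}[0] = "January";

my $hash_kind  = ref($array[$x]);          # a hash reference appeared
my $array_kind = ref($array[$x]{"foo"});   # and an array reference inside it

# A three-dimensional array that grows as it is assigned to.
my @score;
$score[1][2][3] += 42;
my $innermost_size = scalar(@{ $score[1][2] });   # indices 0..3 now exist

print "$hash_kind $array_kind $score[1][2][3] $innermost_size\n";
```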
-
- 4. If a reference happens to be a reference to an object,
- then there are probably methods to access the things
- referred to, and you should probably stick to those
- methods unless you're in the class package that defines
- the object's methods. In other words, be nice, and
- don't violate the object's encapsulation without a very
- good reason. Perl does not enforce encapsulation. We
- are not totalitarians here. We do expect some basic
- civility though.
-
- The ref() operator may be used to determine what type of
- thing the reference is pointing to. See the perlfunc
- manpage.
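A quick sketch of what ref() reports for the kinds of references built earlier. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $scalar_kind = ref(\"text");      # reference to a scalar
my $array_kind  = ref([]);           # anonymous array
my $hash_kind   = ref({});           # anonymous hash
my $code_kind   = ref(sub { 1 });    # anonymous subroutine
my $ref_kind    = ref(\\"text");     # reference to a reference
my $plain       = ref("text");       # not a reference at all

print join(' ', $scalar_kind, $array_kind, $hash_kind,
                $code_kind, $ref_kind, "[$plain]"), "\n";
```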
-
- The bless() operator may be used to associate a reference
- with a package functioning as an object class. See the
- perlobj manpage.
-
- A type glob may be dereferenced the same way a reference
- can, since the dereference syntax always indicates the kind
- of reference desired. So ${*foo} and ${\$foo} both indicate
- the same scalar variable.
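A small sketch showing that both spellings reach the same scalar. It assumes $foo is a package variable (declared here with our), since a type glob lives in the symbol table.

```perl
use strict;
use warnings;

our $foo = 42;                 # package variable, so *foo exists

my $via_glob = ${*foo};        # dereference the type glob as a scalar
my $via_ref  = ${\$foo};       # dereference a hard reference

${*foo} = 7;                   # assigning through the glob changes $foo

print "$via_glob $via_ref $foo\n";
```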
-
- Here's a trick for interpolating a subroutine call into a
- string:
-
- print "My sub returned ${\mysub(1,2,3)}\n";
-
- The way it works is that when the ${...} is seen in the
- double-quoted string, it's evaluated as a block. The block
- executes the call to mysub(1,2,3), and then takes a
- reference to that. So the whole block returns a reference
- to a scalar, which is then dereferenced by ${...} and stuck
- into the double-quoted string.
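Here is the trick in runnable form. The body of mysub() is invented for the example (it just adds up its arguments), and it returns a single scalar.

```perl
use strict;
use warnings;

# Hypothetical subroutine: returns the sum of its arguments.
sub mysub {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# The block takes a reference to the return value; ${...} dereferences it.
my $message = "My sub returned ${\mysub(1,2,3)}\n";
print $message;
```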
-
- Symbolic references
-
- We said that references spring into existence as necessary
- if they are undefined, but we didn't say what happens if a
- value used as a reference is already defined, but ISN'T a
- hard reference. If you use it as a reference in this case,
- it'll be treated as a symbolic reference. That is, the
- value of the scalar is taken to be the NAME of a variable,
- rather than a direct link to a (possibly) anonymous value.
-
- People frequently expect it to work like this. So it does.
-
- $name = "foo";
- $$name = 1; # Sets $foo
- ${$name} = 2; # Sets $foo
- ${$name x 2} = 3; # Sets $foofoo
- $name->[0] = 4; # Sets $foo[0]
- @$name = (); # Clears @foo
- &$name(); # Calls &foo() (as in Perl 4)
- $pack = "THAT";
- ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval
-
- This is very powerful, and slightly dangerous, in that it's
- possible to intend (with the utmost sincerity) to use a hard
- reference, and accidentally use a symbolic reference
- instead. To protect against that, you can say
-
- use strict 'refs';
-
- and then only hard references will be allowed for the rest
- of the enclosing block. An inner block may countermand that
- with
-
- no strict 'refs';
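A sketch of the two pragmas side by side. The variable names are illustrative, and the run-time failure is caught with eval{} so the script can inspect the message.

```perl
use strict;
use warnings;

our $foo = 0;          # package variable that the symbolic reference will name
my $name = "foo";

{
    no strict 'refs';  # symbolic references allowed inside this block only
    $$name = 1;        # sets $main::foo
}
my $after_symbolic = $foo;

# Outside that block strict 'refs' is back in force, so this dies at run time.
my $ok    = eval { my $v = $$name; 1 };
my $error = $ok ? '' : $@;

print "foo=$after_symbolic\n";
print "error: $error";
```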
-
- Only package variables are visible to symbolic references.
- Lexical variables (declared with my()) aren't in a symbol
- table, and thus are invisible to this mechanism. For
- example:
-
- local($value) = 10;
- $ref = \$value;
- {
- my $value = 20;
- print $$ref;
- }
-
- This will still print 10, not 20. Remember that local()
- affects package variables, which are all "global" to the
- package.
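The same point can be made with a symbolic reference directly. In this sketch (variable names chosen for illustration) the lookup by name finds the package variable even though a lexical of the same name is in scope.

```perl
use strict;
use warnings;

our $value = 10;            # package variable, lives in the symbol table
my $seen;

{
    my $value = 20;         # lexical, not in any symbol table
    my $name  = "value";
    no strict 'refs';
    $seen = ${$name};       # looked up by name, so it finds the package variable
}

print "seen=$seen\n";
```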
-
- Further Reading
-
- Besides the obvious documents, source code can be
- instructive. Some rather pathological examples of the use
- of references can be found in the t/op/ref.t regression test
- in the Perl source directory.